Cleveland’s Changing Population

Exploring the US Census with R

Published

February 1, 2025

The resurgence of people moving to downtown Cleveland is making news.1 According to a study commissioned by Downtown Cleveland Inc., the downtown population was almost 19,000 in the 2020 census, a 22% increase from 2010.2 However, Cleveland Open Data shows only 13,000.3 Cleveland Scene reports that there are lots of estimates out there, one as low as 8,000!4 What gives? The organizations may be using different sources, like the decennial US census vs the more recent, but less comprehensive, American Community Survey. But it seems more likely they are using different geographic boundaries.

I was able to reproduce some estimates. My main tools to do this were the tidycensus R package for US Census data, and the Cleveland Open Data service for Cleveland neighborhood definitions. I’ll step through the process below.

Note

This is a work file / tutorial. Researching Cleveland’s population is mostly a toy project to experiment with R tools that work with APIs. This should come in handy for some future project. If you are not me, I hope this helps with whatever you’re doing. Otherwise, ‘hello, future me!’ You can find the source code and downloaded data on my GitHub page.

Defining “Downtown”

Cleveland extends from Cleveland Hopkins Airport on the west all the way to Euclid on the east. It’s mostly bounded on the south by I-480. Here is the map from the Cleveland Wikipedia page.

Screen capture from Cleveland article on Wikipedia.

The 2020 US decennial census counted 372K people in Cleveland.5 That’s a decline from 397K in 2010. The 1-year American Community Survey (ACS) shows it is still falling, down to 363K in 2023.6 But the decline is uneven, and parts of the city are actually growing, including the downtown area. There is no official definition of downtown, so we can make some choices. The Census Bureau provides the building blocks for a definition: over 15K census blocks in Cleveland, rolled up to around 200 census tracts.

Cleveland’s City Planning Commission (CPC) defines 34 neighborhoods for urban planning initiatives.7 They are commonly referred to as Statistical (or Social) Planning Areas (SPAs). I pasted a pdf map from the CPC below. You can see there is an SPA actually named “Downtown”. It’s bounded by the Cuyahoga River and I-90. Cleveland Open Data has an interactive map that you can explore and download. I downloaded and extracted its shapefile to my local drive.

Screen capture from City Planning Commission 2010 Census pdf.

So that is one definition. A second one comes from a study by Urban Partners that was commissioned by Downtown Cleveland, Inc. in 2023. Page 3 of the pdf report (copy/pasted below) shows a Westside and a Downtown Core. Whereas the Downtown SPA had about 13.3K people in the 2020 census, this Downtown Core had 18.7K people. The main differences are that Urban Partners took a bite out of the Central neighborhood on the east side, and parts of the West Bank of the Flats in the Cuyahoga Valley and Ohio City neighborhoods on the west side.

Downtown Cleveland Market Study Report, p3. Urban Partners.

Blocks, Tracts, and Subdivisions

Let’s gather the materials to segment population estimates into these boundaries. Several R libraries make it easy to work with census data. The tidycensus package was developed to interface with the US Census Bureau APIs. It also returns feature geometries for spatial analysis. The tigris package works with the Census Bureau’s TIGER/Line shape files, and the sf (simple features) package performs spatial operations.

library(tidyverse)
library(glue)
library(scales)
library(gt)
library(ggiraph)  # interactive plots

library(tidycensus)
library(tigris) # TIGER/Line shapefiles
library(sf)  # simple features for spatial analysis

Let’s get the CPC’s definition of neighborhoods. I went to the City of Cleveland Open Data web site and navigated to their analysis of the 2020 US Census.8 Their interactive map has five layers (screen capture below). The first is the shape file of the 34 neighborhoods (SPAs). The second file contains population data from the 2020 decennial census complete with census block, census tract, and SPA. I downloaded and unzipped the first two files. Now I have a way to map the SPA boundaries within Cleveland, and I have a mapping of census blocks to SPAs so I can join this to the US Census data.

Screen capture from Open Data
# Nice contiguous shape file. One record for each of the 34 SPAs.
cleve_neigh_0 <-
  st_read(file.path(
  "inputs/Cleveland Neighborhoods",
  "Neighborhood_Population_Change.shp"
))

# Cleveland populated blocks. Includes block, tract, and SPA name.
cleve_blocks <- st_read(file.path(
  "inputs/Cleveland Populated Blocks 2020",
  "Decennial_2020_Populated_Blocks_Cleveland_Only.shp"
)) |>
  select(-starts_with("P0"), -starts_with("H0"))

I could join cleve_neigh_0 to the US Census Bureau data spatially with the sf package. For 2020 I don’t need to, because cleve_blocks tells me exactly which blocks belong in each SPA. But block definitions change across censuses, so for 2000 and 2010 I’ll fall back on a spatial join to the neighborhood shapes. The shape file may not be perfectly precise, because I can’t quite match the quoted population estimates for 2000 and 2010, but it’s close.

Load shape files from the tigris package to facilitate mapping. I’ll get the state, county boundaries, and a few cities. I also got the Terminal Tower coordinates from Google.

oh_state <- tigris::states(cb = TRUE) |> filter(STUSPS == "OH") 

oh_counties <- tigris::counties(cb = TRUE) |> filter(STUSPS == "OH")
cuya_county <- oh_counties |> filter(NAME == "Cuyahoga")

oh_places <- tigris::places("OH", year = 2022)
x <- c("Cleveland Heights", "Mayfield Heights")
my_places <- oh_places |> filter(NAME %in% x) |> st_centroid()

terminal_tower <- st_sfc(st_point(c(-81.69387, 41.49824)), crs = 4326)

Here is a plot of Cuyahoga County, Cleveland, and its 34 SPAs. Hover over the shapes to see their names. There’s Terminal Tower in the heart of downtown. Progressive Field is a few blocks away, and 7.0 miles from my home in Cleveland Heights. Mayfield is where Progressive Insurance is headquartered. That’s where I work when I need to go into the office (I work from home).

p <-
  ggplot() +
  geom_sf(data = oh_state, color = "gray60") +
  geom_sf_interactive(
    data = oh_counties, 
    aes(tooltip = NAME),
    fill = "honeydew", color = "gray90"
  ) +
  geom_sf(data = cuya_county, fill = "honeydew2", color = "gray80") +
  geom_sf_interactive(
    data = cleve_neigh_0,
    aes(tooltip = SPA_NAME),
    fill = "honeydew3", color = "honeydew4"
  ) +
  geom_sf(data = my_places, color = "honeydew3") +
  geom_sf(data = terminal_tower, color = "firebrick") +
  geom_sf_text(data = terminal_tower, aes(label = "Terminal Tower"),
               size = 3, hjust = .2, vjust = 1, color = "firebrick") +
  geom_sf_text(data = my_places, aes(label = NAME), size = 3, hjust = .2, vjust = 1) +
  coord_sf(xlim = c(-82.0, -81.3), ylim = c(41.25, 41.65)) +
  theme(
    panel.background = element_rect(fill = "skyblue"),
    panel.grid = element_blank(),
    axis.text = element_blank()
  ) +
  labs(
    x = NULL, y = NULL, 
    title = glue("Cleveland and Surrounding Cities, Cuyahoga County")
  )

girafe(ggobj = p)

Census Data

I don’t want to abuse the US Census Bureau API, so I’ll set a flag to only download data as I’m developing this script. Once I have what I want, I’ll keep my data on my local drive and build my report.

USE_API <- FALSE
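
The flag works, but it has to be flipped by hand. An alternative is to let the existence of the cached file decide. This is a minimal sketch in base R, not what the script below does: `cached()` and `fake_api()` are hypothetical names, and `fake_api()` only stands in for a real `get_decennial()` call.

```r
# Hypothetical helper: run fetch() only when no cached copy exists,
# or when a refresh is explicitly requested.
cached <- function(path, fetch, refresh = FALSE) {
  if (refresh || !file.exists(path)) {
    saveRDS(fetch(), path)
  }
  readRDS(path)
}

# Stand-in for an API call that counts how often it is hit.
calls <- 0
fake_api <- function() {
  calls <<- calls + 1
  data.frame(NAME = "Cleveland", value = 372624)
}

path <- tempfile(fileext = ".rds")
first <- cached(path, fake_api)   # no file yet: calls fake_api() and saves
second <- cached(path, fake_api)  # file exists: reads the saved copy
calls                             # the API stand-in ran only once
```

The trade-off is that a stale file is reused silently, so `refresh = TRUE` (or deleting the file) is needed whenever the variable list changes.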

The Census Bureau API allows you to select multiple variables from a single census file. There are a few files for each census, and the variable names change. I want the Cleveland area population in 2000, 2010, 2020, and the American Community Survey (ACS) 1-year estimate from 2023 (most recent). So despite the handiness of the tidycensus package, data collection is still going to be a bit tedious.

The decennial census developer page lists the accessible datasets: 2000, 2010, and 2020. You need an API key from the Bureau before you can do anything. This is quick and easy: just click the “Request a KEY” tile in the menu at the left. The Census Bureau emails you a key. Best practice is to save the key in an .Renviron file.

usethis::edit_r_environ(scope = "project")

This opens (or creates) a .Renviron file in your project root. Add your key. The name is important: CENSUS_API_KEY. The tidycensus functions read that environment variable if you don’t explicitly supply a key in the function call. Set it like this:

CENSUS_API_KEY="abc123"
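
R reads .Renviron at startup, so the key is only visible after restarting the session (or after calling `readRenviron(".Renviron")`). A quick check along these lines, using only base R, confirms the variable is set before any API call is attempted. The messages are my own wording.

```r
# Read the key from the environment; returns "" when it is not set.
key <- Sys.getenv("CENSUS_API_KEY")

if (nzchar(key)) {
  message("CENSUS_API_KEY is set.")
} else {
  message("CENSUS_API_KEY is not set. Restart R or run readRenviron('.Renviron').")
}
```

It is also worth keeping .Renviron out of version control so the key does not end up in a public repository.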

Now you can pull census data. I’ll start with 2020.

2020

The decennial census developer page has several data files for each census. Through trial and error, I discovered Redistricting Data (PL 94-171) contains overall population. There is a full list of variables that represent the various sub-groups of the population. I used it and the tidycensus::load_variables() function to identify the ones I want. I’ll include race/ethnicity to investigate demographic trends.

pl_2020_vars <-
  tidycensus::load_variables(2020, "pl") |>
  filter(
    between(name, "P2_001N", "P2_011N"),
    !name %in% c("P2_003N", "P2_004N")
  )

Here they are after a bit of cleaning.

pl_2020_vars <- 
  pl_2020_vars |>
  mutate(
    label = case_when(
      str_detect(label, "White") ~ "White",
      str_detect(label, "Black") ~ "Black",
      str_detect(label, "Asian") ~ "Asian",
      str_detect(label, "American Indian") ~ "American Indian",
      str_detect(label, "Native Hawaiian") ~ "Pacific Islander",
      str_detect(label, "Some Other Race") ~ "Other",
      str_detect(label, "two or more races") ~ "Two or more races",
      str_detect(label, "Hispanic") ~ "Hispanic",
      str_detect(label, "Total") ~ "Total",
      TRUE ~ label
    ),
    rpt_group = if_else(name == "P2_001N", "Total", "Race/ethnicity"),
    rpt_level = if_else(
      label %in% c("White", "Black", "Hispanic", "Asian", "Total"),
      label, "Other")
  ) |>
  select(variable = name, label, rpt_group, rpt_level)

pl_2020_vars
# A tibble: 9 × 4
  variable label             rpt_group      rpt_level
  <chr>    <chr>             <chr>          <chr>    
1 P2_001N  Total             Total          Total    
2 P2_002N  Hispanic          Race/ethnicity Hispanic 
3 P2_005N  White             Race/ethnicity White    
4 P2_006N  Black             Race/ethnicity Black    
5 P2_007N  American Indian   Race/ethnicity Other    
6 P2_008N  Asian             Race/ethnicity Asian    
7 P2_009N  Pacific Islander  Race/ethnicity Other    
8 P2_010N  Other             Race/ethnicity Other    
9 P2_011N  Two or more races Race/ethnicity Other    

The Demographic Profile contains age data.

dp_2020_vars <- 
  tidycensus::load_variables(2020, "dp") |>
  filter(
    str_detect(label, "Count!!SEX AND AGE!!Total population"),
    !str_detect(label, "Selected Age Categories"),
    name != "DP1_0001C"
  )

I’ll ignore sex and aggregate the ages into ten-year buckets.

dp_2020_vars <- 
  dp_2020_vars |>
  mutate(
    label = str_remove_all(label, "(Count!!SEX AND AGE!!Total population)|(!!)"),
    label = if_else(label == "", "Total", label),
    rpt_group = "Age",
    rpt_level = case_when(
      name <= "DP1_0004C" ~ "Under 15 yrs",
      name <= "DP1_0006C" ~ "15 to 24 yrs",
      name <= "DP1_0008C" ~ "25 to 34 yrs",
      name <= "DP1_0010C" ~ "35 to 44 yrs",
      name <= "DP1_0012C" ~ "45 to 54 yrs",
      name <= "DP1_0014C" ~ "55 to 64 yrs",
      TRUE ~ "65+ yrs"
    )
  ) |>
  select(variable = name, label, rpt_group, rpt_level)

dp_2020_vars
# A tibble: 18 × 4
   variable  label             rpt_group rpt_level   
   <chr>     <chr>             <chr>     <chr>       
 1 DP1_0002C Under 5 years     Age       Under 15 yrs
 2 DP1_0003C 5 to 9 years      Age       Under 15 yrs
 3 DP1_0004C 10 to 14 years    Age       Under 15 yrs
 4 DP1_0005C 15 to 19 years    Age       15 to 24 yrs
 5 DP1_0006C 20 to 24 years    Age       15 to 24 yrs
 6 DP1_0007C 25 to 29 years    Age       25 to 34 yrs
 7 DP1_0008C 30 to 34 years    Age       25 to 34 yrs
 8 DP1_0009C 35 to 39 years    Age       35 to 44 yrs
 9 DP1_0010C 40 to 44 years    Age       35 to 44 yrs
10 DP1_0011C 45 to 49 years    Age       45 to 54 yrs
11 DP1_0012C 50 to 54 years    Age       45 to 54 yrs
12 DP1_0013C 55 to 59 years    Age       55 to 64 yrs
13 DP1_0014C 60 to 64 years    Age       55 to 64 yrs
14 DP1_0015C 65 to 69 years    Age       65+ yrs     
15 DP1_0016C 70 to 74 years    Age       65+ yrs     
16 DP1_0017C 75 to 79 years    Age       65+ yrs     
17 DP1_0018C 80 to 84 years    Age       65+ yrs     
18 DP1_0019C 85 years and over Age       65+ yrs     
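
The bucketing above leans on string comparison (`name <= "DP1_0004C"`), which works because the variable codes are zero-padded and all the same length. The sketch below shows that comparison on a few of the codes from the table, plus a more explicit alternative that extracts the number and buckets it with `cut()`. The alternative is only an illustration, not the approach used in this post.

```r
vars <- c("DP1_0002C", "DP1_0004C", "DP1_0005C", "DP1_0014C", "DP1_0019C")

# Zero-padded codes of equal length sort in the same order as their numbers.
is_under_15 <- vars <= "DP1_0004C"

# Alternative: pull out the four-digit number and bucket it numerically.
# Each interval is closed on the right, e.g. (4, 6] covers codes 5 and 6.
num <- as.integer(substr(vars, 5, 8))
bucket <- cut(
  num,
  breaks = c(1, 4, 6, 8, 10, 12, 14, 19),
  labels = c("Under 15 yrs", "15 to 24 yrs", "25 to 34 yrs", "35 to 44 yrs",
             "45 to 54 yrs", "55 to 64 yrs", "65+ yrs")
)

data.frame(variable = vars, under_15 = is_under_15, rpt_level = bucket)
```

The string version is shorter, but it would quietly break if the codes were ever not padded to the same width.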

With the variables identified, request the data from the API. Cleveland is one of 59 subdivisions within Cuyahoga County. I’ll download the subdivision data to get a total count for Cuyahoga County and for Cleveland. Counties are composed of census tracts, and census tracts are composed of census blocks. Cities overlap census tracts, so I’ll download the block-level data and join to the cleve_blocks dataset from Cleveland Open Data. Urban Partners defined their Downtown Core and Westside areas by tract and some blocks from the Central SPA. I figured out which tracts and blocks by studying their map and swearing a lot.

# Utility function to create factors
my_rpt_relevel <- function(x) {
  ethn <- c("Black", "White", "Hispanic", "Asian", "Other", "Total")
  x <- fct_relevel(x, ethn, after = Inf)
  x <- fct_relevel(x, "Under 15 yrs", after = 0)
  return(x)
}

# Urban Partners defn of Westside, uses tracts. The block-level data
# includes the tract number.
westside_tracts_2020 <- c(
  "103100", "103400", "103500", "103602", "103800", "103900", "104100", 
  "104200", "104300", "197800", "197700", "197500", "104400"
)

# Urban Partners defn of Downtown Core, uses tracts and blocks
downtown_core_tracts_2020 <- c(
  "103300", "107101", "107701", "107802", "109301")
downtown_core_blocks_2020 <- paste0(
  "39035108701", c("2001", "2004", "2006", "2008"))

if (USE_API) {

  subdiv_2020_pl <- 
    get_decennial( 
      geography = "county subdivision",
      sumfile = "pl",
      variables = pl_2020_vars$variable,
      state = "OH",
      county = "Cuyahoga",
      geometry = TRUE, 
      year = 2020
    ) |> 
    inner_join(pl_2020_vars, by = "variable") |>
    summarize(
      .by = c(GEOID, NAME, geometry, rpt_group, rpt_level),
      value = sum(value)
    )
  
  subdiv_2020_dp <- 
    get_decennial( 
      geography = "county subdivision",
      sumfile = "dp",
      variables = dp_2020_vars$variable,
      state = "OH",
      county = "Cuyahoga",
      geometry = TRUE, 
      year = 2020
    ) |> 
    inner_join(dp_2020_vars, by = "variable") |>
    summarize(
      .by = c(GEOID, NAME, geometry, rpt_group, rpt_level),
      value = sum(value)
    )
  
  # tract_2020_pl <- 
  #   get_decennial( 
  #     geography = "tract",
  #     sumfile = "pl",
  #     variables = pl_2020_vars$variable,
  #     state = "OH",
  #     county = "Cuyahoga",
  #     geometry = TRUE, 
  #     year = 2020
  #   ) |> 
  #   inner_join(pl_2020_vars, by = "variable") |>
  #   summarize(
  #     .by = c(GEOID, NAME, geometry, rpt_group, rpt_level),
  #     value = sum(value)
  #   )
  # 
  # tract_2020_dp <- 
  #   get_decennial( 
  #     geography = "tract",
  #     sumfile = "dp",
  #     variables = dp_2020_vars$variable,
  #     state = "OH",
  #     county = "Cuyahoga",
  #     geometry = TRUE, 
  #     year = 2020
  #   ) |>
  #   inner_join(dp_2020_vars, by = "variable") |>
  #   summarize(
  #     .by = c(GEOID, NAME, geometry, rpt_group, rpt_level),
  #     value = sum(value)
  #   )
  
  block_2020_pl <- 
    get_decennial( 
      geography = "block",
      sumfile = "pl",
      variables = pl_2020_vars$variable,
      state = "OH",
      county = "Cuyahoga",
      geometry = TRUE, 
      year = 2020
    ) |> 
    inner_join(pl_2020_vars, by = "variable") |>
    summarize(
      .by = c(GEOID, NAME, geometry, rpt_group, rpt_level),
      value = sum(value)
    )
  
  # dp is not available at the block level

  subdiv_2020 <- 
    bind_rows(subdiv_2020_pl, subdiv_2020_dp) |>
    mutate(
      rpt_level = my_rpt_relevel(rpt_level),
      NAME = str_remove_all(NAME, "(, Cuyahoga County, Ohio)|(village)|(city)"),
      NAME = str_trim(NAME)
    )
           
  # tract_2020 is not built because the tract-level pulls above are commented out.
  # tract_2020 <-
  #   bind_rows(tract_2020_pl, tract_2020_dp) |>
  #   mutate(rpt_level = my_rpt_relevel(rpt_level))
  
  block_2020 <-
    block_2020_pl |>
    inner_join(
      cleve_blocks |> as_tibble() |> select(GEOID20, SPA = SPA_NAME), 
      by = c("GEOID" = "GEOID20")
    ) |>
    mutate(
      rpt_level = my_rpt_relevel(rpt_level),
      greater_downtown = case_when(
        str_sub(GEOID, 6, 11) %in% westside_tracts_2020 ~ "Westside",
        str_sub(GEOID, 6, 11) %in% downtown_core_tracts_2020 ~ "Downtown Core",
        GEOID %in% downtown_core_blocks_2020 ~ "Downtown Core",
        TRUE ~ "Other"
      ),
      SPA = factor(str_to_title(SPA)),
      SPA = fct_relevel(SPA, "Downtown", after = 0),
      greater_downtown = factor(
        greater_downtown, levels = c("Downtown Core", "Westside", "Other"))
    )
  
  save(subdiv_2020, block_2020, file = "decennial_2020.Rdata")

} else {
  
  load("decennial_2020.Rdata")
  
}
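
The `str_sub(GEOID, 6, 11)` calls above rely on the fixed layout of a 15-character census block GEOID: two digits for the state, three for the county, six for the tract, and four for the block. Here is a small base R illustration using one of the Downtown Core block IDs built in the code above.

```r
# One of the Downtown Core blocks: "39035108701" pasted with "2001".
geoid <- paste0("39035108701", "2001")

parts <- c(
  state  = substr(geoid, 1, 2),    # 39 = Ohio
  county = substr(geoid, 3, 5),    # 035 = Cuyahoga County
  tract  = substr(geoid, 6, 11),   # six-digit tract code
  block  = substr(geoid, 12, 15)   # four-digit block code
)

parts
```

This is why the tract vectors hold six-character strings such as "103300" while the block vectors hold the full 15-character IDs.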

The table below slices the data three ways. The top section is the subdivision data: Cuyahoga County has 1.3 million people, with Cleveland at 372,624. The second section groups the block-level data by SPA. The Downtown SPA had 13,302 people, which matches the data table on Cleveland Open Data. The third section uses the Urban Partners boundaries: their Downtown Core, which includes portions of the Central, Ohio City, and Cuyahoga Valley SPAs, had 18,708 people.

The map on the second tab shows Cuyahoga County and all of its subdivisions. Cleveland is the largest, and each of its SPAs is broken out. The Downtown SPA is highlighted. The Urban Partners extensions to downtown aren’t shown.

2020 Population Estimates for Cleveland and Vicinity
Population
Cuyahoga County Subdivisions
Cleveland 372,624
Other 892,193
Total 1,264,817
Cleveland Neighborhoods
Downtown 13,302
Bellaire-Puritas 13,823
Broadway-Slavic Village 19,022
Brooklyn Centre 8,315
Buckeye Shaker 11,419
Central 11,955
Clark Fulton 7,625
Collinwood Nottingham 9,616
Cudell 9,115
Cuyahoga Valley 1,293
Detroit-Shoreway 11,326
Edgewater 6,000
Euclid Green 5,051
Fairfax 5,167
Glenville 21,137
Goodrich-Kirtland Park 3,955
Hopkins 534
Hough 9,702
Jefferson 17,351
Kamms Corners 24,312
Kinsman 5,876
Lee-Harvard 9,770
Lee-Seville 4,171
Mount Pleasant 14,015
North Shore Collinwood 14,928
Ohio City 9,219
Old Brooklyn 32,315
Saint Clair-Superior 5,139
Stockyards 9,522
Tremont 7,798
Union-Miles Park 15,625
University Circle 9,620
West Boulevard 18,981
Woodland Hills 5,625
Total 372,624
Greater Downtown
Downtown Core 18,708
Westside 18,407
Other 335,509
Total 372,624

2010

Unfortunately, pulling 2010 and 2000 isn’t as simple as changing the year parameter in the API calls because they use a different file, Summary File 1.

sf1_2010_vars <-
  tidycensus::load_variables(2010, "sf1") |>
  filter(
    concept %in% c("HISPANIC OR LATINO ORIGIN BY RACE", "SEX BY AGE"),
    !name %in% c("P005002", "P012001", "P012002", "P012026"),
    !str_detect(label, "Total!!Hispanic or Latino!!"),
    !str_detect(name, "^PCT012")
  )

I’ll prepare the variables the same way as with 2020.

sf1_2010_vars <-
  sf1_2010_vars |>
  mutate(
    label = str_remove(label, "(Total!!Male!!)|(Total!!Female!!)"),
    label = case_when(
      str_detect(label, "White") ~ "White",
      str_detect(label, "Black") ~ "Black",
      str_detect(label, "Asian") ~ "Asian",
      str_detect(label, "American Indian") ~ "American Indian",
      str_detect(label, "Native Hawaiian") ~ "Pacific Islander",
      str_detect(label, "Some Other Race") ~ "Other",
      str_detect(label, "Two or More Races") ~ "Two or more races",
      str_detect(label, "Hispanic") ~ "Hispanic",
      str_detect(label, "Total") ~ "Total",
      TRUE ~ label,
    ),
    rpt_group = case_when(
      name == "P005001" ~ "Total",
      between(name, "P005003", "P005010") ~ "Race/ethnicity",
      TRUE ~ "Age"
    ),
    rpt_level = case_when(
      label %in% c("White", "Black", "Hispanic", "Asian", "Total", "Other") ~ label,
      label %in% c("American Indian", "Pacific Islander", "Two or more races") ~ "Other",
      label %in% c("Under 5 years", "5 to 9 years", "10 to 14 years") ~ "Under 15 yrs",
      between(label, "15 to 17 years", "22 to 24 years") ~ "15 to 24 yrs",
      label %in% c("25 to 29 years", "30 to 34 years") ~ "25 to 34 yrs",
      label %in% c("35 to 39 years", "40 to 44 years") ~ "35 to 44 yrs",
      label %in% c("45 to 49 years", "50 to 54 years") ~ "45 to 54 yrs",
      between(label, "55 to 59 years", "62 to 64 years") ~ "55 to 64 yrs",
      between(label, "65 and 66 years", "85 years and over") ~ "65+ yrs"
    )
  ) |>
  select(variable = name, label, rpt_group, rpt_level)

sf1_2010_vars
# A tibble: 55 × 4
   variable label             rpt_group      rpt_level   
   <chr>    <chr>             <chr>          <chr>       
 1 P005001  Total             Total          Total       
 2 P005003  White             Race/ethnicity White       
 3 P005004  Black             Race/ethnicity Black       
 4 P005005  American Indian   Race/ethnicity Other       
 5 P005006  Asian             Race/ethnicity Asian       
 6 P005007  Pacific Islander  Race/ethnicity Other       
 7 P005008  Other             Race/ethnicity Other       
 8 P005009  Two or more races Race/ethnicity Other       
 9 P005010  Hispanic          Race/ethnicity Hispanic    
10 P012003  Under 5 years     Age            Under 15 yrs
# ℹ 45 more rows

Request the data from the API. This time I cannot join to cleve_blocks to get a precise mapping of census blocks to SPAs, because that file uses the 2020 block definitions. Instead, I’ll spatially join the block centroids to the cleve_neigh shape file. This turns out to be almost as good, but not perfect.

# Tract definitions are the same for 2010.
westside_tracts_2010 <- westside_tracts_2020

downtown_core_tracts_2010 <- c(
  "103300", "107101", "107701", "107802", "109301")
downtown_core_blocks_2010 <- 
  paste0("39035108701", c("3000", "3001", "3002", "3003", "3004"))

if (USE_API) {

  subdiv_2010 <- 
    get_decennial( 
      geography = "county subdivision",
      sumfile = "sf1",
      variables = sf1_2010_vars$variable,
      state = "OH",
      county = "Cuyahoga",
      geometry = TRUE, 
      year = 2010
    ) |> 
    inner_join(sf1_2010_vars, by = "variable") |>
    summarize(
      .by = c(GEOID, NAME, geometry, rpt_group, rpt_level),
      value = sum(value)
    ) |>
    mutate(
      rpt_level = my_rpt_relevel(rpt_level),
      NAME = str_remove_all(NAME, "(, Cuyahoga County, Ohio)|(village)|(city)"),
      NAME = str_trim(NAME)
    )
  
  # tract_2010 <- 
  #   get_decennial( 
  #     geography = "tract",
  #     sumfile = "sf1",
  #     variables = sf1_2010_vars$variable,
  #     state = "OH",
  #     county = "Cuyahoga",
  #     geometry = TRUE, 
  #     year = 2010
  #   ) |> 
  #   inner_join(sf1_2010_vars, by = "variable") |>
  #   summarize(
  #     .by = c(GEOID, NAME, geometry, rpt_group, rpt_level),
  #     value = sum(value)
  #   ) |>
  #   mutate(rpt_level = my_rpt_relevel(rpt_level))

  block_2010_0 <- 
    get_decennial( 
      geography = "block",
      sumfile = "sf1",
      variables = sf1_2010_vars$variable,
      state = "OH",
      county = "Cuyahoga",
      geometry = TRUE, 
      year = 2010
    ) |> 
    inner_join(sf1_2010_vars, by = "variable") |>
    summarize(
      .by = c(GEOID, NAME, geometry, rpt_group, rpt_level),
      value = sum(value)
    ) |>
    mutate(rpt_level = my_rpt_relevel(rpt_level))

  block_2010 <-
    st_join(cleve_neigh, st_centroid(block_2010_0), join = st_contains) |>
    mutate(
      rpt_level = my_rpt_relevel(rpt_level),
      greater_downtown = case_when(
        str_sub(GEOID, 6, 11) %in% westside_tracts_2010 ~ "Westside",
        str_sub(GEOID, 6, 11) %in% downtown_core_tracts_2010 ~ "Downtown Core",
        GEOID %in% downtown_core_blocks_2010 ~ "Downtown Core",
        TRUE ~ "Other"
      ),
      SPA = factor(str_to_title(SPA)),
      SPA = fct_relevel(SPA, "Downtown", after = 0),
      greater_downtown = factor(
        greater_downtown, levels = c("Downtown Core", "Westside", "Other"))
    ) |>
    select(GEOID, NAME, geometry, rpt_group, rpt_level, value, SPA, greater_downtown)
  
  save(subdiv_2010, block_2010, file = "decennial_2010.Rdata")

} else {
  
  load("decennial_2010.Rdata")
  
}

This time the sum of the neighborhoods, 395,601, doesn’t quite equal the city total, 396,815. There must be census blocks whose centroids fall outside the shapes in cleve_neigh. Comparing my values to those in the data table on Cleveland Open Data, the largest differences are in Euclid Green, Kamm’s Corners, and Hopkins. I haven’t thought of a good way to fix this, so I’m settling for “close enough”.
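
One possible way to find, and optionally reassign, the blocks that slip through a centroid join is sketched below. It uses toy squares and points rather than the Cleveland files, and the nearest-neighborhood fallback is only an idea I have not applied to the numbers in this post. The names `neigh`, `pts`, and `sq()` are hypothetical.

```r
library(sf)

# Two toy "neighborhoods": unit squares side by side.
sq <- function(x0) {
  st_polygon(list(rbind(
    c(x0, 0), c(x0 + 1, 0), c(x0 + 1, 1), c(x0, 1), c(x0, 0)
  )))
}
neigh <- st_sf(SPA = c("A", "B"), geometry = st_sfc(sq(0), sq(1)))

# Three toy block centroids; the last one sits just outside both squares.
pts <- st_sf(
  GEOID = c("b1", "b2", "b3"),
  value = c(10, 20, 5),
  geometry = st_sfc(
    st_point(c(0.5, 0.5)), st_point(c(1.5, 0.5)), st_point(c(2.1, 0.5))
  )
)

# Left join from the points, so unmatched blocks are kept with SPA = NA.
joined <- st_join(pts, neigh, join = st_within)
missed <- is.na(joined$SPA)

# Fallback: assign each missed block to the nearest neighborhood.
joined$SPA[missed] <- neigh$SPA[st_nearest_feature(pts[missed, ], neigh)]

tapply(joined$value, joined$SPA, sum)
```

With county-wide block data, a fallback like this would also pull in suburban blocks, so it would have to be limited to blocks already known to be inside the city or within a small distance of a neighborhood boundary.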

The Downtown SPA population of 9,464 does match the value reported in Cleveland Open Data. It was quite a bit lower than in 2020 (13,302). My Downtown Core population of 15,156 is slightly different from Urban Partners’ value of 15,330. The Westside population does match though.
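
To see how much the choice of boundary matters, here is the growth arithmetic for 2010 to 2020 using only figures quoted in this post. The helper name `pct_change()` is my own.

```r
# Percent change from an old value to a new value.
pct_change <- function(old, new) 100 * (new - old) / old

growth <- c(
  downtown_spa       = pct_change(9464, 13302),   # Downtown SPA
  downtown_core_mine = pct_change(15156, 18708),  # my Downtown Core estimate
  downtown_core_up   = pct_change(15330, 18708)   # Urban Partners' 2010 value
)

round(growth, 1)
# downtown_spa 40.6, downtown_core_mine 23.4, downtown_core_up 22.0
```

The last figure lines up with the 22% increase quoted at the top of the post, while the narrower Downtown SPA grew almost twice as fast in percentage terms.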

2010 Population Estimates for Cleveland and Vicinity
Population
Cuyahoga County Subdivisions
Cleveland 396,815
Other 883,307
Total 1,280,122
Cleveland Neighborhoods
Downtown 9,464
Bellaire-Puritas 13,380
Broadway-Slavic Village 22,331
Brooklyn Centre 8,948
Buckeye Shaker 12,470
Central 12,306
Clark Fulton 8,509
Collinwood Nottingham 11,542
Cudell 9,295
Cuyahoga Valley 1,378
Detroit-Shoreway 11,577
Edgewater 5,851
Euclid Green 4,873
Fairfax 6,239
Glenville 27,394
Goodrich-Kirtland Park 4,238
Hopkins 646
Hough 11,490
Jefferson 16,548
Kamms Corners 24,097
Kinsman 6,966
Lee-Harvard 10,326
Lee-Seville 4,477
Mount Pleasant 17,320
North Shore Collinwood 15,768
Ohio City 8,396
Old Brooklyn 32,009
Saint Clair-Superior 6,876
Stockyards 10,411
Tremont 7,975
Union-Miles Park 19,004
University Circle 7,939
West Boulevard 18,880
Woodland Hills 6,678
Total 395,601
Greater Downtown
Downtown Core 15,156
Westside 18,433
Other 362,012
Total 395,601

2000

2000 is similar to 2010 in that it uses Summary File 1.

sf1_2000_vars <-
  tidycensus::load_variables(2000, "sf1") |>
  filter(
    concept %in% c(
      "HISPANIC OR LATINO, AND NOT HISPANIC OR LATINO BY RACE [73]", 
      "SEX BY AGE [49]"
    ),
    !name %in% c("P004003", "P004004", "P012001", "P012002", "P012026"),
    !str_detect(label, "Population of two or more races!!"),
    !str_detect(name, "^PCT013")
  )

Same process: prepare the variables.

sf1_2000_vars <-
  sf1_2000_vars |>
  mutate(
    label = str_remove(label, "(Total!!Male!!)|(Total!!Female!!)"),
    label = case_when(
      str_detect(label, "White") ~ "White",
      str_detect(label, "Black") ~ "Black",
      str_detect(label, "Asian") ~ "Asian",
      str_detect(label, "American Indian") ~ "American Indian",
      str_detect(label, "Native Hawaiian") ~ "Pacific Islander",
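      # Label capitalization can differ between census files, so these two
      # patterns are matched case-insensitively.
      str_detect(label, regex("some other race", ignore_case = TRUE)) ~ "Other",
      str_detect(label, regex("two or more races", ignore_case = TRUE)) ~ "Two or more races",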
      str_detect(label, "Hispanic") ~ "Hispanic",
      str_detect(label, "Total") ~ "Total",
      TRUE ~ label,
    ),
    rpt_group = case_when(
      name == "P004001" ~ "Total",
      between(name, "P004002", "P004011") ~ "Race/ethnicity",
      TRUE ~ "Age"
    ),
    rpt_level = case_when(
      label %in% c("White", "Black", "Hispanic", "Asian", "Total", "Other") ~ label,
      label %in% c("American Indian", "Pacific Islander", "Two or more races") ~ "Other",
      label %in% c("Under 5 years", "5 to 9 years", "10 to 14 years") ~ "Under 15 yrs",
      between(label, "15 to 17 years", "22 to 24 years") ~ "15 to 24 yrs",
      label %in% c("25 to 29 years", "30 to 34 years") ~ "25 to 34 yrs",
      label %in% c("35 to 39 years", "40 to 44 years") ~ "35 to 44 yrs",
      label %in% c("45 to 49 years", "50 to 54 years") ~ "45 to 54 yrs",
      between(label, "55 to 59 years", "62 to 64 years") ~ "55 to 64 yrs",
      between(label, "65 and 66 years", "85 years and over") ~ "65+ yrs"
    )
  ) |>
  select(variable = name, label, rpt_group, rpt_level)

sf1_2000_vars

Request the data from the API. I’ll use the cleve_neigh shape file again to identify the SPAs. Tract and block identifiers can change from census to census, so I had to make some changes to the Downtown Core definition. I used the same block identifiers for Urban Partners’ definitions, but they did not include 2000 in their report, so I’m not sure how much this differs.

# Westside tract definitions are the same for 2000.
westside_tracts_2000 <- westside_tracts_2020

downtown_core_tracts_2000 <- c(
  "107100", "107200", "107300", "107400", "107500", "107600", "107700",
  "107800", "107900", "109200")

downtown_core_blocks_2000 <- 
  paste0("39035108701", c("3000", "3001", "3002", "3003", "3004"))

if (USE_API) {

  subdiv_2000_0 <- 
    get_decennial( 
      geography = "county subdivision",
      sumfile = "sf1",
      variables = sf1_2000_vars$variable,
      state = "OH",
      county = "Cuyahoga",
      geometry = FALSE, # no county subdivision geography in 2000
      year = 2000
    ) |> 
    inner_join(sf1_2000_vars, by = "variable") |>
    summarize(
      .by = c(GEOID, NAME, rpt_group, rpt_level),
      value = sum(value)
    ) |>
    mutate(
      rpt_level = my_rpt_relevel(rpt_level),
      NAME = str_remove_all(NAME, "(, Cuyahoga County, Ohio)|(village)|(city)"),
      NAME = str_trim(NAME)
    )
  
  # No geometry for 2000? No problem. I'll use the 2010 geometry and replace the 
  # values with 2000.
  subdiv_2000_1 <- 
    subdiv_2000_0 |> 
    as_tibble() |> 
    select(GEOID, rpt_group, rpt_level, value)
  
  subdiv_2000 <- 
    subdiv_2010 |>
    select(-value) |>
    inner_join(subdiv_2000_1, by = c("GEOID", "rpt_group", "rpt_level"))
  
  # tract_2000 <- 
  #   get_decennial( 
  #     geography = "tract",
  #     sumfile = "sf1",
  #     variables = sf1_2000_vars$variable,
  #     state = "OH",
  #     county = "Cuyahoga",
  #     geometry = TRUE, 
  #     year = 2000
  #   ) |> 
  #   inner_join(sf1_2000_vars, by = "variable") |>
  #   summarize(
  #     .by = c(GEOID, NAME, geometry, rpt_group, rpt_level),
  #     value = sum(value)
  #   ) |>
  #   mutate(rpt_level = my_rpt_relevel(rpt_level))

  block_2000_0 <- 
    get_decennial( 
      geography = "block",
      sumfile = "sf1",
      variables = sf1_2000_vars$variable,
      state = "OH",
      county = "Cuyahoga",
      geometry = TRUE, 
      year = 2000
    ) |> 
    inner_join(sf1_2000_vars, by = "variable") |>
    summarize(
      .by = c(GEOID, NAME, geometry, rpt_group, rpt_level),
      value = sum(value)
    ) |>
    mutate(rpt_level = my_rpt_relevel(rpt_level))

  block_2000 <-
    st_join(cleve_neigh, st_centroid(block_2000_0), join = st_contains) |>
    mutate(
      rpt_level = my_rpt_relevel(rpt_level),
      greater_downtown = case_when(
        str_sub(GEOID, 6, 11) %in% westside_tracts_2000 ~ "Westside",
        str_sub(GEOID, 6, 11) %in% downtown_core_tracts_2000 ~ "Downtown Core",
        GEOID %in% downtown_core_blocks_2000 ~ "Downtown Core",
        TRUE ~ "Other"
      ),
      SPA = factor(str_to_title(SPA)),
      SPA = fct_relevel(SPA, "Downtown", after = 0),
      greater_downtown = factor(
        greater_downtown, levels = c("Downtown Core", "Westside", "Other"))
    ) |>
    select(GEOID, NAME, geometry, rpt_group, rpt_level, value, SPA, greater_downtown)
  
  save(subdiv_2000, block_2000, file = "decennial_2000.Rdata")

} else {
  
  load("decennial_2000.Rdata")
  
}
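
The `greater_downtown` assignment above depends on the fixed-width layout of a 15-character census block GEOID: 2 digits of state, 3 of county, 6 of tract, and 4 of block. Here is a minimal base R sketch of that slicing. The first ID comes from `downtown_core_blocks_2000`; the second is made up for illustration.

```r
# Split 15-character census block GEOIDs into their fixed-width parts.
geoids <- c(
  "390351087013000", # a block listed in downtown_core_blocks_2000
  "390351071001005"  # hypothetical block ID, for illustration only
)

parts <- data.frame(
  geoid  = geoids,
  state  = substr(geoids, 1, 2),   # 39 = Ohio
  county = substr(geoids, 3, 5),   # 035 = Cuyahoga
  tract  = substr(geoids, 6, 11),  # same slice as str_sub(GEOID, 6, 11)
  block  = substr(geoids, 12, 15)
)

print(parts)
```

Because the tract is always characters 6 through 11, a block can be matched against a tract list without a separate lookup table.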

As with 2010, the sum of the neighborhoods, 477,107, doesn’t quite match the city value, 478,403, but it is still pretty close. Wow, 478,403 people in 2000: that’s over 100K more than in 2020. On the other hand, only 6,310 people lived Downtown. The Downtown resurgence does not seem to be a recent phenomenon.
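
To put numbers on "pretty close", here is a quick arithmetic sketch using only figures quoted in this post (the variable names are mine):

```r
city_2000  <- 478403  # Cleveland county subdivision, 2000
neigh_2000 <- 477107  # sum of the neighborhood block totals, 2000
city_2020  <- 372624  # Cleveland, 2020 census
downtown   <- 6310    # Downtown neighborhood, 2000

gap            <- city_2000 - neigh_2000        # people not assigned to a neighborhood
gap_pct        <- 100 * gap / city_2000         # gap as a percent of the city value
loss_2000_2020 <- city_2000 - city_2020         # citywide decline, 2000 to 2020
downtown_share <- 100 * downtown / neigh_2000   # Downtown as a percent of the total

cat("Gap:", gap, "people,", round(gap_pct, 2), "% of the city value\n")
cat("Decline 2000 to 2020:", loss_2000_2020, "\n")
cat("Downtown share in 2000:", round(downtown_share, 1), "%\n")
```

The gap is 1,296 people, well under half a percent, which is plausible for blocks whose centroids fall just outside a neighborhood polygon.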

2000 Population Estimates for Cleveland and Vicinity
Population
Cuyahoga County Subdivisions
Cleveland 478,403
Other 915,575
Total 1,393,978
Cleveland Neighborhoods
Downtown 6,310
Bellaire-Puritas 14,520
Broadway-Slavic Village 30,652
Brooklyn Centre 10,155
Buckeye Shaker 16,063
Central 11,568
Clark Fulton 10,672
Collinwood Nottingham 15,874
Cudell 10,630
Cuyahoga Valley 1,307
Detroit-Shoreway 13,917
Edgewater 6,360
Euclid Green 6,169
Fairfax 8,447
Glenville 39,941
Goodrich-Kirtland Park 4,580
Hopkins 338
Hough 14,734
Jefferson 18,266
Kamms Corners 25,256
Kinsman 10,256
Lee-Harvard 11,665
Lee-Seville 5,595
Mount Pleasant 24,013
North Shore Collinwood 18,346
Ohio City 8,726
Old Brooklyn 34,169
Saint Clair-Superior 11,534
Stockyards 12,076
Tremont 9,317
Union-Miles Park 26,539
University Circle 9,386
West Boulevard 20,492
Woodland Hills 9,234
Total 477,107
Greater Downtown
Downtown Core 8,412
Westside 15,154
Other 453,541
Total 477,107

2023 (ACS)

The American Community Survey publishes 1-year and 5-year estimates for 2023. The 1-year estimates are the most current, but they are only published for areas with at least 65,000 people, so there is no block-level data. I’ll download the county subdivision file and check in on Cleveland as a whole.

acs1_2023_vars <-
  tidycensus::load_variables(2023, "acs1") |>
  filter(
    concept %in% c("Sex by Age", "Hispanic or Latino Origin by Race"),
    # between(name, "B01001_001E_001N", "P2_011N"),
    !name %in% c("B01001_002", "B01001_026", "B03002_001", "B03002_002",
                 "B03002_010", "B03002_011"),
    name <= "B03002_012"
  )

Same variable prep.

acs1_2023_vars <-
  acs1_2023_vars |>
  mutate(
    label = str_remove_all(label, "(Estimate!!Total:!!)|(Male:!!)|(Female:!!)"),
    label = case_when(
      str_detect(label, "White") ~ "White",
      str_detect(label, "Black") ~ "Black",
      str_detect(label, "Asian") ~ "Asian",
      str_detect(label, "American Indian") ~ "American Indian",
      str_detect(label, "Native Hawaiian") ~ "Pacific Islander",
      str_detect(label, "Some other race") ~ "Other",
      str_detect(label, "Two or more races") ~ "Two or more races",
      str_detect(label, "Hispanic") ~ "Hispanic",
      str_detect(label, "Total") ~ "Total",
      TRUE ~ label,
    ),
    rpt_group = case_when(
      name == "B01001_001" ~ "Total",
      between(name, "B03002_003", "B03002_012") ~ "Race/ethnicity",
      TRUE ~ "Age"
    ),
    rpt_level = case_when(
      label %in% c("White", "Black", "Hispanic", "Asian", "Total", "Other") ~ label,
      label %in% c("American Indian", "Pacific Islander", "Two or more races") ~ "Other",
      label %in% c("Under 5 years", "5 to 9 years", "10 to 14 years") ~ "Under 15 yrs",
      between(label, "15 to 17 years", "22 to 24 years") ~ "15 to 24 yrs",
      label %in% c("25 to 29 years", "30 to 34 years") ~ "25 to 34 yrs",
      label %in% c("35 to 39 years", "40 to 44 years") ~ "35 to 44 yrs",
      label %in% c("45 to 49 years", "50 to 54 years") ~ "45 to 54 yrs",
      between(label, "55 to 59 years", "62 to 64 years") ~ "55 to 64 yrs",
      between(label, "65 and 66 years", "85 years and over") ~ "65+ yrs"
    )
  ) |>
  select(variable = name, label, rpt_group, rpt_level)

acs1_2023_vars
# A tibble: 55 × 4
   variable   label           rpt_group rpt_level   
   <chr>      <chr>           <chr>     <chr>       
 1 B01001_001 Total           Total     Total       
 2 B01001_003 Under 5 years   Age       Under 15 yrs
 3 B01001_004 5 to 9 years    Age       Under 15 yrs
 4 B01001_005 10 to 14 years  Age       Under 15 yrs
 5 B01001_006 15 to 17 years  Age       15 to 24 yrs
 6 B01001_007 18 and 19 years Age       15 to 24 yrs
 7 B01001_008 20 years        Age       15 to 24 yrs
 8 B01001_009 21 years        Age       15 to 24 yrs
 9 B01001_010 22 to 24 years  Age       15 to 24 yrs
10 B01001_011 25 to 29 years  Age       25 to 34 yrs
# ℹ 45 more rows
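
The `between()` calls on `label` above compare strings alphabetically. That works for these labels, but it depends on how the text happens to sort. As an alternative sketch (not the approach used in this post), the leading number of each age label can be parsed and binned with `cut()`. The `age_bucket()` helper is a hypothetical name.

```r
# Hypothetical helper: map ACS age labels to the report buckets by number
# rather than by alphabetical order. Intended for age labels only.
age_bucket <- function(label) {
  # Labels without a leading digit ("Under 5 years") start at age 0.
  start <- rep(0, length(label))
  has_num <- grepl("^[0-9]", label)
  start[has_num] <- as.numeric(sub("^([0-9]+).*$", "\\1", label[has_num]))

  cut(
    start,
    breaks = c(0, 15, 25, 35, 45, 55, 65, Inf),
    labels = c("Under 15 yrs", "15 to 24 yrs", "25 to 34 yrs", "35 to 44 yrs",
               "45 to 54 yrs", "55 to 64 yrs", "65+ yrs"),
    right = FALSE  # intervals are [0, 15), [15, 25), and so on
  )
}

age_labels <- c("Under 5 years", "10 to 14 years", "20 years", "22 to 24 years",
                "60 and 61 years", "65 and 66 years", "85 years and over")

print(data.frame(label = age_labels, rpt_level = age_bucket(age_labels)))
```

The trade-off is a little more code in exchange for bins that are defined by age rather than by sort order.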

Request the data from the API.

if (USE_API) {

  subdiv_2023 <- 
    get_acs( 
      geography = "county subdivision",
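      # Note: get_acs() selects the dataset with `survey` (default "acs5");
      # `sumfile` is a get_decennial() argument.
      sumfile = "acs1",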
      variables = acs1_2023_vars$variable,
      state = "OH",
      county = "Cuyahoga",
      geometry = FALSE, # no geo file for ACS-1yr
      year = 2023
    ) |> 
    inner_join(acs1_2023_vars, by = "variable") |>
    summarize(
      .by = c(GEOID, NAME, rpt_group, rpt_level),
      value = sum(estimate)
    ) |>
    mutate(
      rpt_level = my_rpt_relevel(rpt_level),
      NAME = str_remove_all(NAME, "(, Cuyahoga County, Ohio)|(village)|(city)"),
      NAME = str_trim(NAME)
    )
  
  # tract_2023 <- 
  #   get_acs( 
  #     geography = "tract",
  #     sumfile = "acs1",
  #     variables = acs1_2023_vars$variable,
  #     state = "OH",
  #     county = "Cuyahoga",
  #     geometry = FALSE, 
  #     year = 2023
  #   ) |> 
  #   inner_join(acs1_2023_vars, by = "variable") |>
  #   summarize(
  #     .by = c(GEOID, NAME, rpt_group, rpt_level),
  #     value = sum(estimate)
  #   ) |>
  #   mutate(rpt_level = my_rpt_relevel(rpt_level))

  save(subdiv_2023, file = "acs1yr_2023.Rdata")

} else {
  
  load("acs1yr_2023.Rdata")
  
}
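
The `USE_API` switch above is a simple cache: call the API once, save the result, and load the file on later runs. The same idea can be wrapped in a small helper. This is only a sketch. `cached()` and `fake_download()` are hypothetical names, and `fake_download()` stands in for a real `get_acs()` call so the example runs offline.

```r
# Hypothetical helper: run `fetch()` only when a refresh is requested or the
# cache file does not exist yet; otherwise read the saved copy.
cached <- function(path, fetch, refresh = FALSE) {
  if (refresh || !file.exists(path)) {
    result <- fetch()
    saveRDS(result, path)
    return(result)
  }
  readRDS(path)
}

# Stand-in for an API call; counts how often it is invoked.
n_calls <- 0
fake_download <- function() {
  n_calls <<- n_calls + 1
  data.frame(NAME = "Cleveland", value = 367523)
}

cache_file <- tempfile(fileext = ".rds")
first  <- cached(cache_file, fake_download)  # calls fetch() and saves the result
second <- cached(cache_file, fake_download)  # reads the saved file instead

print(n_calls)
```

This sketch uses `saveRDS()` and `readRDS()` rather than `save()` and `load()`, so each file holds one object and the caller chooses the variable name.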

Cleveland’s population has continued to decline, down to 367,523 from 372,624 in 2020. (The ACS 1-year subject table cited in footnote 6 puts the 2023 figure lower still, at 362,670.)

2023 Population Estimates for Cleveland and Vicinity
Population
Cleveland 367,523
Other 881,895
Total 1,249,418
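
ACS values are survey estimates, so each one comes with a margin of error (the `moe` column in `get_acs()` output), published at the 90% confidence level. A short sketch using the 1-year figure quoted in footnote 6 (362,670 +/- 62):

```r
estimate <- 362670
moe_90   <- 62

# 90% confidence interval around the estimate
ci_90 <- c(lower = estimate - moe_90, upper = estimate + moe_90)

# Standard error implied by a 90% margin of error (z = 1.645),
# then rescaled to a 95% margin of error (z = 1.96).
se     <- moe_90 / 1.645
moe_95 <- 1.96 * se

print(ci_90)
cat("Standard error:", round(se, 1), "\n")
cat("95% margin of error:", round(moe_95, 1), "\n")
```

For small areas the margin of error can be a large fraction of the estimate, which is worth checking before reading much into year-to-year changes.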

Race/ethnicity

While Cleveland’s white and black populations have fallen since 2000, its Asian, Hispanic, and other (including two or more races) populations have increased. In the rest of Cuyahoga County, the non-white populations have increased, while the white share has fallen from 80% in 2000 to 67% in 2023.

Cuyahoga Race/ethnicity Population Change
Group                2000           2010           2020           2023
                    Count   %      Count   %      Count   %      Count   %
Cleveland
  Black           241,512 50%    208,208 52%    176,813 47%    169,138 46%
  White           185,641 39%    132,710 33%    119,547 32%    124,183 34%
  Hispanic         43,648  9%     39,534 10%     48,699 13%     47,132 13%
  Asian             6,284  1%      7,213  2%     10,390  3%      8,356  2%
  Other             1,318  0%      9,150  2%     17,175  5%     18,714  5%
Other Parts of Cuyahoga County
  Black           137,885 15%    166,760 19%    188,356 21%    188,335 21%
  White           732,936 80%    653,267 74%    599,206 67%    588,395 67%
  Hispanic         24,924  3%     21,736  2%     34,628  4%     37,732  4%
  Asian            18,735  2%     25,402  3%     33,349  4%     31,916  4%
  Other             1,095  0%     16,142  2%     36,654  4%     35,517  4%
County Total
  Total         1,393,978      1,280,122      1,264,817      1,249,418
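
Each percentage in the table is a group's share of its area's total for that year. A sketch that recomputes the 2000 and 2023 shares for the rest of Cuyahoga County from the counts above:

```r
other_county <- data.frame(
  group  = c("Black", "White", "Hispanic", "Asian", "Other"),
  n_2000 = c(137885, 732936, 24924, 18735, 1095),
  n_2023 = c(188335, 588395, 37732, 31916, 35517)
)

# Share of the area total, rounded to whole percents as in the table
other_county$pct_2000 <- round(100 * other_county$n_2000 / sum(other_county$n_2000))
other_county$pct_2023 <- round(100 * other_county$n_2023 / sum(other_county$n_2023))

print(other_county)
```

The column sums also reproduce the "Other" subdivision totals reported earlier (915,575 in 2000 and 881,895 in 2023), which is a handy consistency check.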

The SPAs with historically high concentrations of black residents, like Glenville, Hough, and Mount Pleasant, have experienced the greatest population declines. The more integrated SPAs, like Downtown, Bellaire-Puritas, and Brooklyn Centre, have tended to retain their populations. These neighborhoods have become more diverse, with rising Hispanic, Asian, and other populations.

Age

Unfortunately, the 2020 and 2023 files used here do not provide block-level data for age. We can still make some inferences from the higher-level summaries. The one trend that stands out is the decline in children under 15 in Cuyahoga County, and especially in Cleveland. The number of people aged 55 and up has held steady or even increased.

Cuyahoga Age Population Change
Age                  2000           2010           2020           2023
                    Count   %      Count   %      Count   %      Count   %
Cleveland
  Under 15 yrs    117,101 24%     80,298 20%     67,636 18%     64,554 18%
  15 to 24 yrs     64,556 13%     61,044 15%     50,100 13%     48,566 13%
  25 to 34 yrs     71,847 15%     53,996 14%     62,334 17%     63,832 17%
  35 to 44 yrs     73,822 15%     49,555 12%     43,901 12%     44,705 12%
  45 to 54 yrs     55,111 12%     59,726 15%     42,857 12%     41,195 11%
  55 to 64 yrs     35,987  8%     44,700 11%     51,614 14%     49,383 13%
  65+ yrs          59,979 13%     47,496 12%     54,182 15%     55,288 15%
Other Parts of Cuyahoga County
  Under 15 yrs    174,502 19%    154,662 18%    143,357 16%    147,603 17%
  15 to 24 yrs    102,919 11%    107,421 12%    106,136 12%    100,684 11%
  25 to 34 yrs    117,026 13%    103,990 12%    116,982 13%    114,935 13%
  35 to 44 yrs    145,627 16%    109,318 12%    104,209 12%    107,483 12%
  45 to 54 yrs    132,490 14%    137,460 16%    107,217 12%    105,247 12%
  55 to 64 yrs     85,829  9%    119,411 14%    129,766 15%    123,563 14%
  65+ yrs         157,182 17%    151,045 17%    184,526 21%    182,380 21%
County Total
  Total         1,393,978      1,280,122      1,264,817      1,249,418
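
To size the decline in children under 15, here is a sketch of the percent change from 2000 to 2023, using the counts in the table above:

```r
under_15 <- data.frame(
  area   = c("Cleveland", "Other Parts of Cuyahoga County"),
  n_2000 = c(117101, 174502),
  n_2023 = c(64554, 147603)
)

# Percent change relative to the 2000 count
under_15$pct_change <- round(
  100 * (under_15$n_2023 - under_15$n_2000) / under_15$n_2000, 1
)

print(under_15)
```

By this measure Cleveland lost roughly 45% of its under-15 population, about three times the rate of the rest of the county.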

Footnotes

  1. “Opinion: Downtown Cleveland’s strategy to broaden appeal sees success”, Crain’s Cleveland Business. “Cleveland’s downtown population continues to surge”, Cleveland Fox 19 News.

  2. Downtown Cleveland Inc. commissioned a report, “Downtown Cleveland Market Study Report” (pdf), by the Urban Partners consulting firm. The report was released in Apr 2023. Figures are from Table 1: 15,330 people in 2010, 18,708 people in 2020 (22% increase).

  3. See the Downtown neighborhood (statistical planning area, SPA) in the data table.

  4. “There’s Still No Agreement on How Many Clevelanders Actually Live Downtown”, Cleveland Scene, Sep 17, 2024.

  5. Cleveland’s population plateaued around 1930 at 900K. The peak was 914K in the 1950 census. Between 1960 and 1980 the population declined by a third. The current population is slightly below the 1900 value. See Visual Cleveland at https://visual.clevelandhistory.org/census/.

  6. 362,670 +/- 62. https://data.census.gov/table/ACSST1Y2023.S0101?q=cleveland,%20oh

  7. Social Planning Areas (SPAs) were developed in the 1950s to coordinate social services at the neighborhood level. Learn more at the Encyclopedia of Cleveland History. Wikipedia has a nice explanation of how neighborhoods relate to Statistical (or Social) Planning Areas.

  8. From https://data.clevelandohio.gov/, go to the Data Catalog and scroll to Census 2020 Analysis.